 user conversation


Hundreds of thousands of Grok chats exposed in Google results

BBC News

The appearance of Grok chats in search engine results was first reported by tech industry publication Forbes, which counted more than 370,000 user conversations on Google. Among chat transcripts seen by the BBC were examples of Musk's chatbot being asked to create a secure password, provide meal plans for weight loss and answer detailed questions about medical conditions. Some indexed transcripts also showed users' attempts to test the limits on what Grok would say or do. In one example seen by the BBC, the chatbot provided detailed instructions on how to make a Class A drug in a lab. It is not the first time that people's conversations with AI chatbots have appeared more widely than they perhaps initially realised when using "share" functions. OpenAI recently rowed back an "experiment" which saw ChatGPT conversations appear in search engine results when shared by users.


How to run an LLM on your laptop

MIT Technology Review

Getting into local models takes a bit more effort than, say, navigating to ChatGPT's online interface. But the very accessibility of a tool like ChatGPT comes with a cost. "It's the classic adage: If something's free, you're the product," says Elizabeth Seger, the director of digital policy at Demos, a London-based think tank. OpenAI, which offers both paid and free tiers, trains its models on users' chats by default. It's not too difficult to opt out of this training, and it also used to be possible to remove your chat data from OpenAI's systems entirely, until a recent legal decision in the New York Times' ongoing lawsuit against OpenAI required the company to maintain all user conversations with ChatGPT.


On the Way to LLM Personalization: Learning to Remember User Conversations

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have quickly become invaluable assistants for a variety of tasks. However, their effectiveness is constrained by their ability to tailor responses to human preferences and behaviors via personalization. Prior work in LLM personalization has largely focused on style transfer or incorporating small factoids about the user, as knowledge injection remains an open challenge. In this paper, we explore injecting knowledge of prior conversations into LLMs to enable future work on less redundant, personalized conversations. We identify two real-world constraints: (1) conversations are sequential in time and must be treated as such during training, and (2) per-user personalization is only viable in parameter-efficient settings. To this end, we propose PLUM, a pipeline that performs data augmentation to up-sample conversations as question-answer pairs, which are then used to finetune a low-rank adaptation adapter with a weighted cross entropy loss. Even in this first exploration of the problem, we perform competitively with baselines such as RAG, attaining an accuracy of 81.5% across 100 conversations.
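
The abstract names a weighted cross entropy loss without saying how the weights are chosen. As a rough illustration only, the sketch below computes a generic per-token weighted cross entropy in plain Python. The function name `weighted_cross_entropy` and the example weights (question tokens down-weighted relative to answer tokens) are assumptions made for this sketch, not details taken from PLUM.

```python
import math

def weighted_cross_entropy(logits, targets, weights):
    """Weighted mean of per-token negative log-likelihoods.

    logits:  one list of raw scores per token position
    targets: the correct vocabulary index for each position
    weights: a non-negative weight per position (hypothetical scheme)
    """
    total, weight_sum = 0.0, 0.0
    for row, target, weight in zip(logits, targets, weights):
        # Numerically stable log-sum-exp gives the log normaliser.
        peak = max(row)
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in row))
        # Negative log-probability of the target token, scaled by its weight.
        total += weight * (log_norm - row[target])
        weight_sum += weight
    return total / weight_sum

# Toy example: three positions over a four-token vocabulary.
# Assumption: the first position is a "question" token and counts less.
logits = [[2.0, 0.5, 0.1, 0.1], [0.2, 1.5, 0.3, 0.0], [0.0, 0.1, 2.5, 0.4]]
targets = [0, 1, 2]
weights = [0.1, 1.0, 1.0]
loss = weighted_cross_entropy(logits, targets, weights)
```

Setting a weight to zero removes that position from the loss entirely, which is one common way to train only on answer tokens.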


A First Look At Efficient And Secure On-Device LLM Inference Against KV Leakage

arXiv.org Artificial Intelligence

Running LLMs on end devices has garnered significant attention recently due to their advantages in privacy preservation. With the advent of lightweight LLM models and specially designed GPUs, on-device LLM inference has achieved the necessary accuracy and performance metrics. However, we have identified that LLM inference on GPUs can leak privacy-sensitive intermediate information, specifically the KV pairs. An attacker could exploit these KV pairs to reconstruct the entire user conversation, leading to significant vulnerabilities. Existing solutions, such as Fully Homomorphic Encryption (FHE) and Trusted Execution Environments (TEE), are either too computation-intensive or resource-limited. To address these issues, we designed KV-Shield, which operates in two phases. In the initialization phase, it permutes the weight matrices so that all KV pairs are correspondingly permuted. During the runtime phase, the attention vector is inversely permuted to ensure the correctness of the layer output. All permutation-related operations are executed within the TEE, ensuring that insecure GPUs cannot access the original KV pairs, thus preventing conversation reconstruction. Finally, we theoretically analyze the correctness of KV-Shield, along with its advantages and overhead.
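
The permutation idea in this abstract can be checked on a toy example. The sketch below is not the authors' implementation: it assumes a single attention head, applies one column permutation to hypothetical query, key, and value weight matrices, and confirms that inversely permuting the attention output recovers the unprotected result. All names (`permute_columns`, `attention`, the matrix sizes) are invented for illustration, and the TEE/GPU split is not modelled.

```python
import math
import random

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def permute_columns(m, perm):
    # Column i of the result is column perm[i] of the input.
    return [[row[j] for j in perm] for row in m]

def softmax(row):
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(x, wq, wk, wv):
    # Single-head scaled dot-product attention; also returns K and V.
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(q[0]))
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    weights = [softmax(r) for r in scores]
    return matmul(weights, v), k, v

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
tokens, dim = 4, 6
x = random_matrix(tokens, dim, rng)
wq, wk, wv = (random_matrix(dim, dim, rng) for _ in range(3))

# Secret permutation and its inverse (in the paper's setting, kept inside the TEE).
perm = list(range(dim))
rng.shuffle(perm)
inverse = [0] * dim
for new_pos, old_pos in enumerate(perm):
    inverse[old_pos] = new_pos

plain_out, plain_k, plain_v = attention(x, wq, wk, wv)

# The "untrusted" computation sees only permuted weights, hence permuted K and V.
shielded_out, shielded_k, shielded_v = attention(
    x, permute_columns(wq, perm), permute_columns(wk, perm), permute_columns(wv, perm)
)

# Inversely permuting the output restores the original column order.
restored = permute_columns(shielded_out, inverse)
max_error = max(abs(a - b) for ra, rb in zip(plain_out, restored) for a, b in zip(ra, rb))
```

The attention scores are unchanged because a dot product is invariant when both vectors have their coordinates reordered in the same way; only the value columns, and therefore the output columns, end up permuted.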


SysBench: Can Large Language Models Follow System Messages?

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have become instrumental across various applications, with the customization of these models to specific scenarios becoming increasingly critical. The system message, a fundamental component of LLMs, consists of carefully crafted instructions that guide the behavior of the model to meet intended goals. Despite the recognized potential of system messages to optimize AI-driven solutions, there is a notable absence of a comprehensive benchmark for evaluating how well different LLMs follow these system messages. To fill this gap, we introduce SysBench, a benchmark that systematically analyzes system message following ability in terms of three challenging aspects: constraint complexity, instruction misalignment and multi-turn stability. In order to enable effective evaluation, SysBench constructs multi-turn user conversations covering various interaction relationships, based on six common types of constraints from system messages in real-world scenarios. Our dataset contains 500 system messages from various domains, each paired with 5 turns of user conversations, which have been manually formulated and checked to guarantee high quality. SysBench provides extensive evaluation across various LLMs, measuring their ability to follow specified constraints given in system messages. The results highlight both the strengths and weaknesses of existing models, offering key insights and directions for future research. The open source library SysBench is available at https://github.com/PKU-Baichuan-MLSystemLab/SysBench.


Rasa-X Is A Unique Approach To Continuous Chatbot Improvement

#artificialintelligence

Landscapers sometimes accommodate desire paths by paving them, thereby integrating them into the official path network rather than blocking them. The image above is of a desire path being blocked and rehabilitated in an attempt to force users onto the designed path. Sometimes, land planners have deliberately left land fully or partially unpathed, waiting to see what desire paths are created, and then paving those. In Finland, planners are known to visit parks immediately after the first snowfall, when the existing paths are not visible. The naturally chosen desire paths, marked by footprints, can then be used to guide the routing of new purpose-built paths.


The subtle art of chatbot development- Client Requirements versus Client Expectations

#artificialintelligence

Chatbots are virtual agents capable of emulating human conversation. They are becoming very popular for providing online services and answering queries. Recently, chatbots have been gaining a lot of attention due to advances in Natural Language Processing (NLP) capabilities. Today, a chatbot can respond much as a human agent would. Being devoid of emotions, a chatbot can be expected to offer the same quality of service throughout the day.


Google admits to listening in on private conversations via Assistant

#artificialintelligence

Google has reportedly admitted that its employees listen to private recordings of customer conversations made via Google Assistant. Moreover, employees are able to access conversations that were not meant to be recorded. A leak of 1,000 private Dutch-language conversations by some of Google's partners to a Belgian news site further showed that third-party contractors working for Google could also access these sensitive user conversations, which were reportedly recorded unintentionally. Usually, users with Google Assistant on their phones and smart speakers have to say "Ok, Google" to start a conversation with the AI-powered virtual assistant. But even when users didn't call up the virtual assistant, various user conversations that were personal and sensitive in nature were recorded.